Generalized Random Forests

Authors

  • Susan Athey
  • Julie Tibshirani
  • Stefan Wager
Abstract

We propose generalized random forests, a method for non-parametric statistical estimation based on random forests (Breiman, 2001) that can be used to fit any quantity of interest identified as the solution to a set of local moment equations. Following the literature on local maximum likelihood estimation, our method operates at a particular point in covariate space by considering a weighted set of nearby training examples; however, instead of using classical kernel weighting functions that are prone to a strong curse of dimensionality, we use an adaptive weighting function derived from a forest designed to express heterogeneity in the specified quantity of interest. We propose a flexible, computationally efficient algorithm for growing generalized random forests, develop a large sample theory for our method showing that our estimates are consistent and asymptotically Gaussian, and provide an estimator for their asymptotic variance that enables valid confidence intervals. We use our approach to develop new methods for three statistical tasks: non-parametric quantile regression, conditional average partial effect estimation, and heterogeneous treatment effect estimation via instrumental variables. A software implementation, grf for R and C++, is available from CRAN.
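The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes a one-dimensional covariate and toy "trees" made of random cut points, which is a stand-in for, not a reproduction of, the splitting rule the paper designs to express heterogeneity. All function names here are hypothetical and are not part of the grf package. The sketch computes forest weights for a target point and then solves the weighted quantile moment equation, which corresponds to the quantile regression task named above.

```python
import bisect
import random


def grow_toy_tree(rng, n_cuts=3):
    """Hypothetical stand-in for a tree: random cut points on [0, 1)."""
    return sorted(rng.random() for _ in range(n_cuts))


def leaf_index(tree, x):
    """Index of the interval (leaf) of `tree` that contains x."""
    return bisect.bisect_right(tree, x)


def forest_weights(trees, xs, x0):
    """Average over trees of 1{x_i shares x0's leaf} / (leaf size)."""
    weights = [0.0] * len(xs)
    used = 0  # trees whose leaf at x0 holds at least one training point
    for tree in trees:
        target = leaf_index(tree, x0)
        members = [i for i, x in enumerate(xs) if leaf_index(tree, x) == target]
        if not members:
            continue
        used += 1
        for i in members:
            weights[i] += 1.0 / len(members)
    if used == 0:
        raise ValueError("no tree places a training point in x0's leaf")
    return [w / used for w in weights]


def weighted_quantile(ys, weights, q):
    """Smallest theta with sum_i weight_i * 1{y_i <= theta} >= q."""
    total = 0.0
    for y, w in sorted(zip(ys, weights)):
        total += w
        if total >= q:
            return y
    return max(ys)  # guards against floating-point shortfall in the sum


# Synthetic data (an assumption for illustration): y = x + Gaussian noise.
rng = random.Random(0)
xs = [rng.random() for _ in range(500)]
ys = [x + rng.gauss(0.0, 0.1) for x in xs]
trees = [grow_toy_tree(rng) for _ in range(200)]

alpha = forest_weights(trees, xs, x0=0.5)
median_at_x0 = weighted_quantile(ys, alpha, q=0.5)
```

The weights are non-negative and sum to one, so any estimator defined through a weighted moment equation (a mean, a quantile, an instrumental-variables ratio) can reuse the same `alpha` vector; only the final solving step changes.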


Related articles

Using generalized additive models and random forests to model prosodic prominence in German

The perception of prosodic prominence is influenced by different sources like different acoustic cues, linguistic expectations and context. We use a generalized additive model and a random forest to model the perceived prominence on a corpus of spoken German. Both models are able to explain over 80% of the variance. While the random forests give us some insights on the relative importance of th...


A generalized inverse for graphs with absorption

We consider weighted, directed graphs with a notion of absorption on the vertices, related to absorbing random walks on graphs. We define a generalized inverse of the graph Laplacian, called the absorption inverse, that reflects both the graph structure as well as the absorption rates on the vertices. Properties of this generalized inverse are presented, including a matrix forest theorem relati...


PageRank and random walks on graphs

We examine the relationship between PageRank and several invariants occurring in the study of random walks and electrical networks. We consider a generalized version of hitting time and effective resistance with an additional parameter which controls the 'speed' of diffusion. We will establish their connection with PageRank. Through these connections, a combinatorial interpretation of PageRank...


Randomer Forests

Random forests (RF) is a popular general purpose classifier that has been shown to outperform many other classifiers on a variety of datasets. The widespread use of random forests can be attributed to several factors, some of which include its excellent empirical performance, scale and unit invariance, robustness to outliers, time and space complexity, and interpretability. While RF has many de...


Detecting gene-by-smoking interactions in a genome-wide association study of early-onset coronary heart disease using random forests

BACKGROUND Genome-wide association studies are often limited in their ability to attain their full potential due to the sheer volume of information created. We sought to use the random forest algorithm to identify single-nucleotide polymorphisms (SNPs) that may be involved in gene-by-smoking interactions related to the early-onset of coronary heart disease. METHODS Using data from the Framing...




Publication date: 2017